\begin{abstract}

In this assignment we use RapidMiner 5.1 to explore techniques for data
preprocessing and linear regression modeling, with the goal of accurately
predicting dollar amounts in response to the 97NK mailing from the KDD Cup
1998 data. We
demonstrate how high ridge parameters in a linear regression, together with
the resulting coefficients, can be used to prune irrelevant features. We also
show how to use Grid Search Optimization to identify an optimal ridge value.
After applying all of our techniques, we created a flow that processes the
data and produces a model with an RMSE of $8.519$ under 10-fold
cross-validation on the learning data.

\end{abstract}
